Add flags to export metrics and failed wf histories #258
Conversation
Commit: "…port failed histories. Exported artifacts are in JSON format"
Force-pushed 0930588 to 4a4935e, then b969524 to a082ddf.
@@ -0,0 +1,192 @@
package resources
This whole thing feels like more work than necessary. Let's just use Prometheus to record all this, and then we can snapshot the DB when it's done. Maybe another option would work too, but let's not write our own custom metrics format and collection.
Let's just use Prometheus to record all this and then we can snapshot the DB when it's done
Could you clarify a bit here?
AFAICT, we don't spin up a Prometheus server to scrape/store our metrics. We just expose an endpoint for Prometheus metrics to be scraped from. So there is no collection currently, except in-memory on the client.
These metrics are collected on the worker, and it isn't immediately obvious to me how we'd want to send worker metrics over to the client to expose on its /metrics endpoint (if we want to do that at all), in tandem with running a Prometheus server.
So I opted to export this as a file.
Maybe another option would work too, but let's not write our own custom metrics format
Yeah, fair. I can replace the current resource tracking with prometheus instrumentation.
What I mean is that we can just run a local Prom instance in lieu of running inside actual infra. If/when we run inside full-on infra-provisioned stuff we can just use a real Prom service, but if we don't have one it's very easy to just spin one up locally and export the stuff from that.
We don't need to send metrics from the worker to the client. Both sides can just expose prom metrics for scraping independently and that's fine.
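The local-Prom setup suggested here might look something like the following scrape config (job names and ports are assumptions for illustration, not Omes defaults), with each process exposing its own /metrics endpoint:

```yaml
# prometheus.yml — hypothetical ports for the client and worker endpoints
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: omes-client
    static_configs:
      - targets: ["localhost:8000"]
  - job_name: omes-worker
    static_configs:
      - targets: ["localhost:8001"]
```

With both sides scraped independently, a snapshot of the local Prometheus data directory after the run serves as the persisted artifact.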
Closing in favor of splitting the PRs, given that there will be a notable difference between the metrics in this PR and the prom PR: local prom
What was changed
Add flags to export scenario metrics, export worker metrics, and export failed histories. Exported artifacts are in JSON format. The added options are:
--export-scenario-metrics
--export-failed-histories
--worker-export-metrics
Currently, the client-side options (--export-scenario-metrics and --export-failed-histories) are only implemented for throughput stress.
The worker option is only implemented for the Go worker.
These options are no-ops on other configurations.
Why?
Provides the ability to persist outcomes and insights from Omes runs.
Closes
How was this tested: